1 Introduction

The development of mathematics towards greater exactness has, as is well-known, led
to formalization of large areas of it such that you can carry out proofs by following a few
mechanical rules. The most comprehensive current formal systems are the system of
Principia Mathematica (PM) on the one hand, the Zermelo-Fraenkelian axiom-system
of set theory on the other hand. These two systems are so far developed that you
can formalize in them all proof methods that are currently in use in mathematics, i.e.
you can reduce these proof methods to a few axioms and deduction rules. Therefore,
the conclusion seems plausible that these deduction rules are sufficient to decide all
mathematical questions expressible in those systems. We will show that this is not true,
but that there are even relatively easy problems in the theory of ordinary whole numbers
that cannot be decided from the axioms. This is not due to the nature of these systems,
but it is true for a very wide class of formal systems, which in particular includes all
those that you get by adding a finite number of axioms to the above mentioned systems,
provided the additional axioms don't make false theorems provable.

Let us first sketch the main intuition for the proof, without going into detail and
of course without claiming to be exact. The formulae of a formal system (we will
restrict ourselves to the PM here) can be viewed syntactically as finite sequences of
the basic symbols (variables, logical constants, and parentheses or separators), and it is
easy to define precisely which sequences of the basic symbols are syntactically correct
formulae and which are not. Similarly, proofs are formally nothing else than finite
sequences of formulae (with specific definable properties). Of course, it is irrelevant
for meta-mathematical observations what signs are taken for basic symbols, and so we
will choose natural numbers for them. Hence, a formula is a finite sequence of natural
numbers, and a proof schema is a finite sequence of finite sequences of natural numbers.
The meta-mathematical concepts (theorems) hereby become concepts (theorems) about
natural numbers, which makes them (at least partially) expressible in the symbols of the
system PM. In particular, one can show that the concepts "formula", "proof schema",
"provable formula" are all expressible within the system PM, i.e. one can, for example,
come up with a formula F(v) of PM that has one free variable v (whose type is sequence
of numbers) such that the semantic interpretation of F(v) is: v is a provable formula.
We will now construct an undecidable theorem of the system PM, i.e. a theorem A for
which neither A nor ¬A is provable, as follows:

We will call a formula of PM with exactly one free variable of type natural numbers a
class-sign. We will assume the class-signs are somehow numbered, call the nth one R_n,
and note that both the concept "class-sign" and the ordering relation R are definable
within the system PM. Let a be an arbitrary class-sign; with a(n) we denote the formula
that you get when you substitute n for the free variable of a. Also, the ternary relation
x = y(z) is definable within PM. Now we will define a class K of natural numbers as
follows:

    K = {n ∈ ℕ | ¬provable(R_n(n))}        (1)

(where provable(x) means x is a provable formula). In other words, K is the set of
numbers n where the formula R_n(n) that you get when you insert n into its own formula R_n
is unprovable. Since all the concepts used for this definition are themselves definable in
PM, so is the compound concept K, i.e. there is a class-sign S such that the formula
S(n) states that n ∈ K. As a class-sign, S is identical with a specific R_q, i.e. we have

    S ⇔ R_q

for a specific natural number q. We will now prove that the theorem R_q(q) is
undecidable within PM. We can understand this by simply plugging in the definitions:
R_q(q) ⇔ q ∈ K ⇔ ¬provable(R_q(q)), in other words, R_q(q) states "I am unprovable."
Assuming the theorem R_q(q) were provable, then it would also be true, i.e. because of
(1) ¬provable(R_q(q)) would be true in contradiction to the assumption. If on the other
hand ¬R_q(q) were provable, then we would have q ∉ K, i.e. provable(R_q(q)). That
means that both R_q(q) and ¬R_q(q) would be provable, which again is impossible.

The analogy of this conclusion with the Richard antinomy leaps to the eye; there
is also a close kinship with the liar antinomy, because our undecidable theorem R_q(q)
states that q is in K, i.e. according to (1) that R_q(q) is not provable. Hence, we have
in front of us a theorem that states its own unprovability. The proof method we just
applied is obviously applicable to any formal system that on the one hand is expressive
enough to allow the definition of the concepts used above (in particular the concept
"provable formula"), and in which on the other hand all provable formulae are also
true. The following exact implementation of the proof will among other things have
the goal to replace the second prerequisite by a purely formal and much weaker one.

From the remark that R_q(q) states its own unprovability it immediately follows that
R_q(q) is correct, since R_q(q) is in fact unprovable (because it is undecidable). The
theorem which is undecidable within the system PM has hence been decided by
meta-mathematical considerations. The exact analysis of this strange fact leads to surprising
results about consistency proofs for formal systems, which will be discussed in section
4 (theorem XI).



2 Main Result 



We will now exactly implement the proof sketched above, and will first give an exact
description of the formal system P, for which we want to show the existence of
undecidable theorems. By and large, P is the system that you get by building the logic of
PM on top of the Peano axioms (numbers as individuals, successor-relation as undefined
basic concept).

2.1 Definitions 

The basic signs of system P are the following: 

I. Constant: "¬" (not), "∨" (or), "∀" (for all), "0" (zero), "succ" (the successor of),
"(", ")" (parentheses). Gödel's original text uses a different notation, but the reader
may be more familiar with the notation adopted in this translation.

II. Variable of type one (for individuals, i.e. natural numbers including 0): "x₁",
"y₁", "z₁", ...

Variables of type two (for classes of individuals, i.e. subsets of ℕ): "x₂", "y₂", "z₂", ...

Variables of type three (for classes of classes of individuals, i.e. sets of subsets of
ℕ): "x₃", "y₃", "z₃", ...

And so on for every natural number as type.

Remark: Variables for binary or n-ary functions (relations) are superfluous as basic
signs, because one can define relations as classes of ordered pairs and ordered pairs
as classes of classes, e.g. the ordered pair (a, b) by {{a}, {a, b}}, where {x, y} and {x}
stand for the classes whose only elements are x, y and x, respectively.
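
The encoding of ordered pairs mentioned in this remark can be tried out in a few lines. The following is only an illustrative sketch in Python; the helper name `pair` is our own and not part of system P, and frozensets merely stand in for classes.

```python
def pair(a, b):
    """Ordered pair (a, b) encoded as the class {{a}, {a, b}}."""
    return frozenset({frozenset({a}), frozenset({a, b})})

# The encoding distinguishes the order of the components ...
print(pair(1, 2) == pair(2, 1))                   # False
# ... and (a, a) collapses to {{a}}, which is still unambiguous.
print(pair(3, 3) == frozenset({frozenset({3})}))  # True
```

Since both components can be recovered from the class, a binary relation can be treated as a class of such pairs, which is why no separate relation variables are needed.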

By a sign of type one we understand a combination of signs of the form

    a, succ(a), succ(succ(a)), succ(succ(succ(a))), ... etc.,

where a is either 0 or a variable of type one. In the first case we call such a sign a
number-sign. For n > 1 we will understand by a sign of type n a variable of type n.
We call combinations of signs of the form a(b), where b is a sign of type n and a a sign
of type n + 1, elementary formulae. We define the class of formulae as the smallest
set that contains all elementary formulae and that contains for a, b always also ¬(a),
(a) ∨ (b), ∀x . (a) (where x is an arbitrary variable). We call (a) ∨ (b) the disjunction
of a and b, ¬(a) the negation and ∀x . (a) the generalization of a. A formula that
contains no free variables (where free variables is interpreted in the usual manner) is
called proposition-formula. We call a formula with exactly n free individual-variables
(and no other free variables) an n-ary relation sign, for n = 1 also class-sign.

By subst a(v/b) (where a is a formula, v is a variable, and b is a sign of the same type
as v) we understand the formula that you get by substituting b for every free occurrence
of v in a. We say that a formula a is a type-lift of another formula b if you can obtain
a from b by increasing the type of all variables occurring in b by the same number.
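
As an informal illustration of this substitution (not part of the formal development), formulae can be modeled as nested Python tuples. The tuple tags "forall", "not", "or", "app" and "succ", as well as the function name, are assumptions made for this sketch only.

```python
def subst(a, v, b):
    """Replace every free occurrence of the variable v in formula a by the sign b."""
    if isinstance(a, str):              # a variable or the constant "0"
        return b if a == v else a
    if a[0] == "forall" and a[1] == v:  # v is bound below this quantifier
        return a
    if a[0] == "forall":
        return ("forall", a[1], subst(a[2], v, b))
    # "not", "or", "app" (elementary formula a(b)) and "succ": recurse into the parts
    return (a[0],) + tuple(subst(part, v, b) for part in a[1:])

# x2(x1) ∨ ∀x1 . x2(x1): only the first occurrence of x1 is free.
f = ("or", ("app", "x2", "x1"), ("forall", "x1", ("app", "x2", "x1")))
g = subst(f, "x1", ("succ", "0"))
print(g)
```

The sketch leaves bound occurrences untouched but does not check whether a variable inside b would become bound after the substitution; the text handles that separately as a side condition on the quantor axioms.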

The following formulae (I through V) are called axioms (they are written with the
help of the abbreviations (defined in the usual manner) ∧, ⇒, ⇔, ∃x, =, and using the
customary conventions for leaving out parentheses):

I. The Peano axioms, which give fundamental properties for natural numbers.

1. ¬(succ(x₁) = 0) We start to count at 0.

2. succ(x₁) = succ(y₁) ⇒ x₁ = y₁ If two natural numbers x₁, y₁ ∈ ℕ have the same
successor, they are equal.

3. (x₂(0) ∧ ∀x₁ . x₂(x₁) ⇒ x₂(succ(x₁))) ⇒ ∀x₁ . x₂(x₁) We can prove a predicate
x₂ on natural numbers by natural induction.

II. Every formula obtained by inserting arbitrary formulae for p, q, r in the following
schemata. We call these proposition axioms.

1. p ∨ p ⇒ p

2. p ⇒ p ∨ q

3. p ∨ q ⇒ q ∨ p

4. (p ⇒ q) ⇒ (r ∨ p ⇒ r ∨ q)

III. Every formula obtained from the two schemata

1. (∀v . a) ⇒ subst a(v/c)

2. (∀v . b ∨ a) ⇒ (b ∨ ∀v . a)

by inserting the following things for a, v, b, c (and executing the operation denoted
by subst in 1.):

Insert an arbitrary formula for a, an arbitrary variable for v, any formula where v
does not occur free for b, and for c a sign of the same type as v with the additional
requirement that c does not contain a free variable that would be bound in a
position in a where v is free.

For lack of a better name, we will call these quantor axioms.

IV. Every formula obtained from the schema

1. ∃u . ∀v . (u(v) ⇔ a)

by inserting for v and u any variables of type n and n + 1 respectively and for
a a formula that has no free occurrence of u. This axiom takes the place of the
reducibility axiom (the comprehension axiom of set theory).

V. Any formula obtained from the following by type-lift (and the formula itself):

1. (∀x₁ . (x₂(x₁) ⇔ y₂(x₁))) ⇒ x₂ = y₂

This axiom states that a class is completely determined by its elements. Let us
call it the set axiom.

A formula c is called the immediate consequence of a and b (of a) if a is the formula
¬b ∨ c (or if c is the formula ∀v . a, where v is any variable). The class of provable
formulae is defined as the smallest class of formulae that contains the axioms and is
closed under the relation "immediate consequence".

2.2 Gödel-numbers

We will now uniquely associate the primitive signs of system P with natural numbers
as follows:

    "0" ... 1      "succ" ... 3    "¬" ... 5
    "∨" ... 7      "∀" ... 9       "(" ... 11
    ")" ... 13

Furthermore we will uniquely associate each variable of type n with a number of the
form p^n (where p is a prime > 13). Thus there is a one-to-one-correspondence between
every finite string of basic signs and a sequence of natural numbers. We now map the
sequences of natural numbers (again in one-to-one correspondence) to natural numbers
by having the sequence n₁, n₂, ..., n_k correspond to the number 2^n₁ · 3^n₂ · ... · p_k^n_k, where p_k
is the kth prime (by magnitude). Thus, there is not only a uniquely associated natural
number for every basic sign, but also for every sequence of basic signs. We will denote
the number associated with the basic sign (resp. the sequence of basic signs) a by Φ(a).
Now let R(a₁, a₂, ..., a_n) be a given class or relation between basic signs or sequences
of them. We will associate that with the class (relation) R'(x₁, x₂, ..., x_n) that holds
between x₁, x₂, ..., x_n if and only if there are a₁, a₂, ..., a_n such that for i = 1, 2, ..., n
we have x_i = Φ(a_i) and the R(a₁, a₂, ..., a_n) holds. We will denote the classes and
relations on natural numbers which are associated with the meta-mathematical concepts,
e.g. "variable", "formula", "proposition-formula", "axiom", "provable formula"
etc., in the above mentioned manner, by the same word in small caps. The proposition
that there are undecidable problems in system P for example reads like this: There are
PROPOSITION-FORMULAE a, such that neither a nor the NEGATION of a is a PROVABLE
FORMULA.
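
The numbering just described can be sketched as a small Python program. This is only an illustration under our own naming (`SIGN`, `primes`, `encode`, `decode` are hypothetical helpers); it maps a sequence of sign numbers to a single natural number via prime powers and back.

```python
from itertools import count

# Numbers of the basic signs, as listed above.
SIGN = {"0": 1, "succ": 3, "¬": 5, "∨": 7, "∀": 9, "(": 11, ")": 13}

def primes():
    """Yield 2, 3, 5, ... by trial division (fine for small examples)."""
    found = []
    for n in count(2):
        if all(n % p for p in found):
            found.append(n)
            yield n

def encode(numbers):
    """Map n1, n2, ..., nk to 2**n1 * 3**n2 * ... * pk**nk."""
    result = 1
    for p, n in zip(primes(), numbers):
        result *= p ** n
    return result

def decode(number):
    """Recover the sequence from its number (all entries are positive)."""
    numbers = []
    for p in primes():
        if number == 1:
            return numbers
        exponent = 0
        while number % p == 0:
            number //= p
            exponent += 1
        numbers.append(exponent)

# The sign string "succ 0" (the number-sign for 1) becomes 2**3 * 3**1.
code = encode([SIGN["succ"], SIGN["0"]])
print(code)          # 24
print(decode(code))  # [3, 1]
```

Because prime factorizations are unique, `decode` inverts `encode` on every sequence of positive numbers, which is the one-to-one correspondence the text relies on.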

2.3 Primitive recursion 

At this point, we will make an excursion to make an observation that a priori does not
have anything to do with the system P, and will first give the following definition: we
say a number-theoretical function φ(x₁, x₂, ..., x_n) is defined via primitive recursion in
terms of the number-theoretical functions ψ(x₁, x₂, ..., x_(n−1)) and μ(x₁, x₂, ..., x_(n+1)) if
the following holds for all x₂, ..., x_n, k:

    φ(0, x₂, ..., x_n) = ψ(x₂, ..., x_n),
    φ(k + 1, x₂, ..., x_n) = μ(k, φ(k, x₂, ..., x_n), x₂, ..., x_n)        (2)

We call a number-theoretical function φ primitive recursive if there is a finite
sequence of number-theoretical functions φ₁, φ₂, ..., φ_n ending in φ such that every
function φ_k of the sequence is either defined from two of the preceding functions by
primitive recursion or results by inserting into any of the preceding ones or, and this is
the base case, is a constant or the successor function succ(x) = x + 1. The length of the
shortest sequence of φ_i belonging to a primitive recursive function φ is called its degree. We
call a relation R(x₁, ..., x_n) primitive recursive if there is a primitive recursive function
φ(x₁, ..., x_n) such that for all x₁, x₂, ..., x_n,

    R(x₁, ..., x_n) ⇔ (φ(x₁, ..., x_n) = 0).

The following theorems hold:

I. Every function (relation) that you get by inserting primitive recursive functions
in the places of variables of other primitive recursive functions (relations) is itself
primitive recursive; likewise every function that you get from primitive recursive
functions by the schema (2).

II. If R and S are primitive recursive relations, then so are ¬R, R ∨ S (and therefore
also R ∧ S).

III. If the functions φ(x̄), ψ(ȳ) are primitive recursive, then so is the relation φ(x̄) =
ψ(ȳ). We have resorted to a vector notation x̄ to denote finite-length tuples of variables.

IV. If the function φ(x̄) and the relation R(y, z̄) are primitive recursive, then so are
the relations S, T

    S(x̄, z̄) ⇔ (∃y ≤ φ(x̄) . R(y, z̄))

    T(x̄, z̄) ⇔ (∀y ≤ φ(x̄) . R(y, z̄))

as well as the function ψ

    ψ(x̄, z̄) = (argmin y ≤ φ(x̄) . R(y, z̄))

where argmin x ≤ f(x̄) . F(x) stands for the smallest x for which (x ≤ f(x̄)) ∧
F(x) holds, and for 0 if there is no such number. Readers to whom an operational
description appeals more may want to think of this as a loop that tries every value from
1 to φ(x̄) to determine the result. The crucial point here is that this theorem does not state
that an unbounded loop (or recursion) is primitive recursive; those are in fact strictly
more powerful in terms of computability.
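
The operational reading suggested above can be made concrete. The following Python sketch (all function names are our own assumptions) builds functions by schema (2) using a loop with a fixed number of steps, and implements the bounded search and the bounded quantifiers of theorem IV.

```python
def prim_rec(psi, mu):
    """Return phi with phi(0, *xs) = psi(*xs) and
    phi(k + 1, *xs) = mu(k, phi(k, *xs), *xs), as in schema (2)."""
    def phi(k, *xs):
        value = psi(*xs)
        for i in range(k):          # bounded loop: exactly k steps
            value = mu(i, value, *xs)
        return value
    return phi

def succ(x):
    return x + 1

# Addition: 0 + y = y and (k + 1) + y = succ(k + y)
add = prim_rec(lambda y: y, lambda k, l, y: succ(l))
# Multiplication: 0 * y = 0 and (k + 1) * y = (k * y) + y
mul = prim_rec(lambda y: 0, lambda k, l, y: add(l, y))

def bounded_argmin(bound, F):
    """Smallest x <= bound with F(x), and 0 if there is none."""
    for x in range(bound + 1):
        if F(x):
            return x
    return 0

def bounded_exists(bound, R):
    """The relation S of theorem IV: some y <= bound satisfies R."""
    return any(R(y) for y in range(bound + 1))

def bounded_forall(bound, R):
    """The relation T of theorem IV: every y <= bound satisfies R."""
    return all(R(y) for y in range(bound + 1))

print(add(3, 4), mul(3, 4))                       # 7 12
print(bounded_argmin(10, lambda x: x * x > 20))   # 5
```

Every loop here has an upper limit that is known before it starts, which is the operational counterpart of primitive recursion; a search without such a limit is not covered by theorem IV.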

Theorem I follows immediately from the definition of "primitive recursive". Theorems
II and III are based upon the fact that the number-theoretical functions

    α(x), β(x, y), γ(x, y)

corresponding to the logical concepts ¬, ∨, = (where n = 0 is taken for true and n ≠ 0
for false), namely

    α(x) = 1 for x = 0
           0 for x ≠ 0

    β(x, y) = 0 if one or both of x, y are = 0
              1 if both x, y are ≠ 0

    γ(x, y) = 0 if x = y
              1 if x ≠ y

are primitive recursive, as one can easily convince oneself. The proof for theorem
IV is, in short, the following: by assumption there is a primitive recursive ρ(y, z̄) such
that:

    R(y, z̄) ⇔ (ρ(y, z̄) = 0)

Using the recursion-schema (2) we now define a function χ(y, z̄) as follows:

    χ(0, z̄) = 0
    χ(n + 1, z̄) = (n + 1) · A + χ(n, z̄) · α(A)

where A = α(α(ρ(0, z̄))) · α(ρ(n + 1, z̄)) · α(χ(n, z̄)).

A, which makes use of the above defined α and of the fact that a product is 0 if one of its
factors is 0, can be described by the following pseudo-code:

    A = if (ρ(0, z̄) = 0)
        then 0
        else if (ρ(n + 1, z̄) ≠ 0)
        then 0
        else if (χ(n, z̄) ≠ 0)
        then 0
        else 1

It is a nice example of how arithmetic can be used to emulate logic.
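
The recursion for χ and the factor A defined above can be checked numerically. The following Python sketch is only an illustration: the tuple z̄ is dropped for brevity, and the example relation encoded by `rho` is a hypothetical choice of ours.

```python
def alpha(x):
    """alpha(x) = 1 for x = 0 and 0 otherwise (arithmetic negation)."""
    return 1 if x == 0 else 0

def chi(n, rho):
    """chi(n) built by the recursion above; rho(y) == 0 encodes R(y)."""
    value = 0                                   # chi(0) = 0
    for k in range(n):
        A = alpha(alpha(rho(0))) * alpha(rho(k + 1)) * alpha(value)
        value = (k + 1) * A + value * alpha(A)  # chi(k + 1)
    return value

# Hypothetical example relation: R(y) holds exactly for y = 3 and y = 5.
rho = lambda y: 0 if y in (3, 5) else 1
values = [chi(n, rho) for n in range(7)]
print(values)  # [0, 0, 0, 3, 3, 3, 3]
```

The printed values stay at 0 until the first y with R(y) is reached and keep that value afterwards, so evaluating `chi` at the bound yields the bounded minimum using only products and sums.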

Therefore, χ(n + 1, z̄) is either = n + 1 (if A = 1) or = χ(n, z̄) (if A = 0). Obviously,
the first case will occur if and only if all factors of A are 1, i.e. if we have

    ¬R(0, z̄) ∧ R(n + 1, z̄) ∧ (χ(n, z̄) = 0).

This implies that the function χ(n, z̄) (viewed as a function of n) remains 0 up to
the smallest value of n for which R(n, z̄) holds, and has that value from then on (if
R(0, z̄) already holds then χ(n, z̄) is correspondingly constant and = 0). Therefore, we
have

    ψ(x̄, z̄) = χ(φ(x̄), z̄),
    S(x̄, z̄) ⇔ R(ψ(x̄, z̄), z̄).

It is easy to reduce the relation T to a case analogous to that of S by negation.
This concludes the proof of theorem IV.

2.4 Expressing metamathematical concepts

As one can easily convince oneself, the functions x + y, x · y, x^y and furthermore the
relations x < y and x = y are primitive recursive. For example, the function x + y can
be constructed as 0 + y = y and (k + 1) + y = succ(k + y), i.e. ψ(y) = y and μ(k, l, y) =
succ(l) in schema (2). Using these concepts, we will now define a sequence of functions
(relations) 1-45, each of which is defined from the preceding ones by the methods given
by theorems I through IV. In doing so, usually several of the definition steps allowed
by theorems I through IV are combined in one. Each of the functions (relations) 1-45,
among which we find for example the concepts "formula", "axiom", and "immediate
consequence", is therefore primitive recursive.

1. y | x ⇔ ∃z ≤ x . x = y · z
x is divisible by y.

2. isPrime(x) ⇔ ¬(∃z ≤ x . (z ≠ 1 ∧ z ≠ x ∧ z | x)) ∧ (x > 1)
x is a prime number.

3. prFactor(0, x) = 0
prFactor(n + 1, x) = argmin y ≤ x . (isPrime(y) ∧ y | x ∧ y > prFactor(n, x))
prFactor(n, x) is the nth (by size) prime number contained in x.

4. 0! = 1
(n + 1)! = (n + 1) · n!

5. nthPrime(0) = 0
nthPrime(n + 1) = argmin y ≤ (nthPrime(n)! + 1) . (isPrime(y) ∧ y > nthPrime(n))
nthPrime(n) is the nth (by size) prime number.

6. item(n, x) = argmin y ≤ x . ((prFactor(n, x)^y | x) ∧ ¬(prFactor(n, x)^(y+1) | x))
item(n, x) is the nth item of the sequence of numbers associated with x (for n > 0 and
n not larger than the length of this sequence).

7. length(x) = argmin y ≤ x . (prFactor(y, x) > 0 ∧ prFactor(y + 1, x) = 0)
length(x) is the length of the sequence of numbers associated with x.

8. x ∘ y = argmin z ≤ nthPrime(length(x) + length(y))^(x+y) .
(∀n ≤ length(x) . item(n, z) = item(n, x)) ∧
(∀0 < n ≤ length(y) . item(n + length(x), z) = item(n, y))
x ∘ y corresponds to the operation of "concatenating" two finite sequences of numbers.

9. seq(x) = 2^x
seq(x) corresponds to the number sequence that consists only of the number x (for
x > 0).

10. paren(x) = seq(11) ∘ x ∘ seq(13)
paren(x) corresponds to the operation of "parenthesizing" (11 and 13 are associated
with the primitive signs "(" and ")").



11. vtype(n, x) ⇔ (∃13 < z ≤ x . isPrime(z) ∧ x = z^n) ∧ n ≠ 0
x is a VARIABLE OF TYPE n.

12. isVar(x) ⇔ ∃n ≤ x . vtype(n, x)
x is a VARIABLE.

13. not(x) = seq(5) ∘ paren(x)
not(x) is the NEGATION of x.

14. or(x, y) = paren(x) ∘ seq(7) ∘ paren(y)
or(x, y) is the DISJUNCTION of x and y.

15. forall(x, y) = seq(9) ∘ seq(x) ∘ paren(y)
forall(x, y) is the GENERALIZATION of y by the VARIABLE x (provided that x is a
VARIABLE).

16. succ_n(0, x) = x
succ_n(n + 1, x) = seq(3) ∘ succ_n(n, x)
succ_n(n, x) corresponds to the operation of "prepending the sign 'succ' in front of x
for n times".

17. number(n) = succ_n(n, seq(1))
number(n) is the NUMBER-SIGN for the number n.

18. stype₁(x) ⇔ ∃m, n ≤ x .
(m = 1 ∨ vtype(1, m)) ∧ x = succ_n(n, seq(m))
x is a SIGN OF TYPE ONE.

19. stype(n, x) ⇔ (n = 1 ∧ stype₁(x)) ∨
(n > 1 ∧ ∃v ≤ x . (vtype(n, v) ∧ x = seq(v)))
x is a SIGN OF TYPE n.

20. elFm(x) ⇔ ∃y, z, n ≤ x .
(stype(n, y) ∧ stype(n + 1, z) ∧ x = z ∘ paren(y))
x is an ELEMENTARY FORMULA.

21. op(x, y, z) ⇔ (x = not(y)) ∨ (x = or(y, z)) ∨ (∃v ≤ x . isVar(v) ∧ x = forall(v, y))

22. fmSeq(x) ⇔ (∀0 < n ≤ length(x) . elFm(item(n, x)) ∨
∃0 < p, q < n . op(item(n, x), item(p, x), item(q, x))) ∧
length(x) > 0
x is a sequence of FORMULAE, each of which is either an ELEMENTARY FORMULA or is
obtained from the preceding ones by the operations of NEGATION, DISJUNCTION,
or GENERALIZATION.



23. isFm(x) ⇔ ∃n ≤ (nthPrime(length(x)^2))^(x · (length(x))^2) .
fmSeq(n) ∧ x = item(length(n), n)
x is a FORMULA (i.e. the last item of a sequence n of FORMULAE).

24. bound(v, n, x) ⇔ isVar(v) ∧ isFm(x) ∧
∃a, b, c ≤ x . x = a ∘ forall(v, b) ∘ c ∧ isFm(b) ∧
length(a) + 1 ≤ n ≤ length(a) + length(forall(v, b))
The VARIABLE v is BOUND in x at position n.

25. free(v, n, x) ⇔ isVar(v) ∧ isFm(x) ∧
v = item(n, x) ∧ n ≤ length(x) ∧ ¬bound(v, n, x)
The VARIABLE v is FREE in x at position n.

26. free(v, x) ⇔ ∃n ≤ length(x) . free(v, n, x)
v occurs in x as a FREE VARIABLE.

27. insert(x, n, y) = argmin z ≤ (nthPrime(length(x) + length(y)))^(x+y) .
∃u, v ≤ x .
x = u ∘ seq(item(n, x)) ∘ v ∧ z = u ∘ y ∘ v ∧ n = length(u) + 1
You obtain insert(x, n, y) from x by inserting y instead of the nth item in the sequence
x (provided that 0 < n ≤ length(x)).

28. freePlace(0, v, x) = argmin n ≤ length(x) .
free(v, n, x) ∧ ¬∃n < p ≤ length(x) . free(v, p, x)
freePlace(k + 1, v, x) = argmin n < freePlace(k, v, x) .
free(v, n, x) ∧ ¬∃n < p < freePlace(k, v, x) . free(v, p, x)
freePlace(k, v, x) is the k + 1st place in x (counted from the end of formula x) where
v is FREE (and 0 if there is no such place).

29. nFreePlaces(v, x) = argmin n ≤ length(x) . freePlace(n, v, x) = 0
nFreePlaces(v, x) is the number of places where v is FREE in x.

30. subst'(0, x, v, y) = x
subst'(k + 1, x, v, y) = insert(subst'(k, x, v, y), freePlace(k, v, x), y)

31. subst(x, v, y) = subst'(nFreePlaces(v, x), x, v, y)
subst(x, v, y) is the above defined concept subst a(v/b).

32. imp(x, y) = or(not(x), y)
and(x, y) = not(or(not(x), not(y)))
equiv(x, y) = and(imp(x, y), imp(y, x))
exists(v, y) = not(forall(v, not(y)))

33. typeLift(n, x) = argmin y ≤ x^(x^n) .
∀k ≤ length(x) .
item(k, x) ≤ 13 ∧ item(k, y) = item(k, x) ∨
item(k, x) > 13 ∧ item(k, y) = item(k, x) · prFactor(1, item(k, x))^n
typeLift(n, x) is the nth TYPE-LIFT of x (if x and typeLift(n, x) are FORMULAE).
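
The sequence operations among the definitions above (item, length, ∘, seq, paren, not, number) can be tried out on small numbers. The Python sketch below is an assumption-laden shortcut: instead of the bounded searches used in the definitions it factorizes directly, and the names `items`, `from_items`, `concat` and `neg` are our own.

```python
from itertools import count

def primes():
    """Yield 2, 3, 5, ... by trial division (fine for small examples)."""
    found = []
    for n in count(2):
        if all(n % p for p in found):
            found.append(n)
            yield n

def items(x):
    """Exponents of the consecutive primes 2, 3, 5, ... in x (cf. definition 6)."""
    out = []
    for p in primes():
        if x % p:
            return out
        e = 0
        while x % p == 0:
            x //= p
            e += 1
        out.append(e)

def from_items(exponents):
    """Inverse of items: 2**e1 * 3**e2 * ..."""
    x = 1
    for p, e in zip(primes(), exponents):
        x *= p ** e
    return x

def length(x):            # definition 7
    return len(items(x))

def concat(x, y):         # definition 8, written "x ∘ y" in the text
    return from_items(items(x) + items(y))

def seq(x):               # definition 9
    return 2 ** x

def paren(x):             # definition 10
    return concat(concat(seq(11), x), seq(13))

def neg(x):               # definition 13, not(x)
    return concat(seq(5), paren(x))

def number(n):            # definition 17: n times "succ" in front of "0"
    return from_items([3] * n + [1])

print(number(2))                 # 1080
print(items(neg(number(1))))     # [5, 11, 3, 1, 13]
```

Working with the exponent lists keeps the examples small; the numbers themselves grow very quickly, which is why the definitions in the text only need to establish that a bound exists, not that it is practical.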

There are three specific numbers corresponding to the axioms I, 1 to 3 (the Peano
axioms), which we will denote by pa₁, pa₂, pa₃, and we define:

34. peanoAxiom(x) ⇔ (x = pa₁ ∨ x = pa₂ ∨ x = pa₃)

35. prop1Axiom(x) ⇔ ∃y ≤ x . isFm(y) ∧ x = imp(or(y, y), y)
x is a FORMULA that has been obtained by inserting into the axiom schema II, 1. We
define prop2Axiom(x), prop3Axiom(x), and prop4Axiom(x) analogously.

36. propAxiom(x) ⇔ prop1Axiom(x) ∨ prop2Axiom(x) ∨ prop3Axiom(x) ∨ prop4Axiom(x)
x is a FORMULA that has been obtained by inserting into one of the proposition axioms.

37. quantor1AxiomCondition(z, y, v) ⇔ ¬∃n ≤ length(y), m ≤ length(z), w ≤ z .
w = item(m, z) ∧ bound(w, n, y) ∧ free(v, n, y)
z does not contain a VARIABLE that is BOUND anywhere in y where v is FREE. This
condition for the applicability of axiom III, 1 ensures that a substitution of z for the
free occurrences of v in y does not accidentally bind some of z's variables.

38. quantor1Axiom(x) ⇔ ∃v, y, z, n ≤ x .
vtype(n, v) ∧ stype(n, z) ∧ isFm(y) ∧ quantor1AxiomCondition(z, y, v) ∧
x = imp(forall(v, y), subst(y, v, z))
x is a FORMULA obtained by substitution from the axiom schema III, 1, i.e. one of the
quantor axioms.

39. quantor2Axiom(x) ⇔ ∃v, q, p ≤ x .
isVar(v) ∧ isFm(p) ∧ ¬free(v, p) ∧ isFm(q) ∧
x = imp(forall(v, or(p, q)), or(p, forall(v, q)))
x is a FORMULA obtained by substitution from the axiom schema III, 2, i.e. the other
one of the quantor axioms.

40. reduAxiom(x) ⇔ ∃u, v, y, n ≤ x .
vtype(n, v) ∧ vtype(n + 1, u) ∧ ¬free(u, y) ∧ isFm(y) ∧
x = exists(u, forall(v, equiv(seq(u) ∘ paren(seq(v)), y)))
x is a FORMULA obtained by substitution from the axiom schema IV, 1, i.e. from the
reducibility axiom.

There is a specific number corresponding to axiom V, 1 (the set axiom), which we
will denote by sa, and we define:

41. setAxiom(x) ⇔ ∃n ≤ x . x = typeLift(n, sa)

42. isAxiom(x) ⇔ peanoAxiom(x) ∨ propAxiom(x) ∨
quantor1Axiom(x) ∨ quantor2Axiom(x) ∨ reduAxiom(x) ∨
setAxiom(x)
x is an AXIOM.

43. immConseq(x, y, z) ⇔ y = imp(z, x) ∨ ∃v ≤ x . isVar(v) ∧ x = forall(v, y)
x is an IMMEDIATE CONSEQUENCE of y and z.



44. isProofFigure(x) ⇔ (∀0 < n ≤ length(x) .
isAxiom(item(n, x)) ∨ ∃0 < p, q < n .
immConseq(item(n, x), item(p, x), item(q, x))) ∧
length(x) > 0
x is a PROOF FIGURE (a finite sequence of FORMULAE, each of which is either an
AXIOM or the IMMEDIATE CONSEQUENCE of two of the preceding ones).

45. proofFor(x, y) ⇔ isProofFigure(x) ∧ item(length(x), x) = y
x is a PROOF for the FORMULA y.

46. provable(x) ⇔ ∃y . proofFor(y, x)
x is a PROVABLE FORMULA. (provable(x) is the only one among the concepts 1-46 for
which we cannot assert that it is primitive recursive).
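
The last few definitions (immediate consequence, proof figure, proof) amount to a mechanical proof checker. The Python sketch below illustrates this on nested tuples instead of numbers; the tuple tags, the function names and the toy axiom set are assumptions for this illustration and not the axioms of P.

```python
def imp(a, b):
    """imp(a, b) = or(not(a), b), as in definition 32."""
    return ("or", ("not", a), b)

def imm_conseq(x, y, z):
    """Definition 43: x follows from y and z by modus ponens,
    or x is a generalization of y."""
    generalization = isinstance(x, tuple) and x[0] == "forall" and x[2] == y
    return y == imp(z, x) or generalization

def is_proof_figure(figure, is_axiom):
    """Definition 44 for a list of formulae."""
    for n, formula in enumerate(figure):
        earlier = figure[:n]
        if is_axiom(formula):
            continue
        if any(imm_conseq(formula, y, z) for y in earlier for z in earlier):
            continue
        return False
    return len(figure) > 0

def proof_for(figure, y, is_axiom):
    """Definition 45: figure is a proof figure ending in y."""
    return is_proof_figure(figure, is_axiom) and figure[-1] == y

# Toy axiom set (an assumption for illustration, not the axioms of P).
axioms = {"A", imp("A", "B")}
figure = ["A", imp("A", "B"), "B"]
print(proof_for(figure, "B", axioms.__contains__))   # True
```

Checking a given figure needs only loops bounded by the length of the figure. Deciding provable(x), by contrast, would require searching through all possible figures without any bound, which is why definition 46 stands apart from the other concepts.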

2.5 Denotability and provability 

The fact that can be expressed vaguely by: Every primitive recursive relation is definable
within system P (interpreting that system as to content), will be expressed in the
following theorem without referring to the interpretation of formulae of P:

Theorem V: For every primitive recursive relation R(x₁, ..., x_n) there is a RELATION
SIGN r (with the FREE VARIABLES u₁, ..., u_n), such that for each n-tuple (x₁, ..., x_n)
the following holds:

    R(x₁, ..., x_n) ⇒ provable(subst(r, u₁ ... u_n, number(x₁) ... number(x_n)))         (3)
    ¬R(x₁, ..., x_n) ⇒ provable(not(subst(r, u₁ ... u_n, number(x₁) ... number(x_n))))   (4)

We content ourselves with giving a sketchy outline of the proof for this theorem here,
since it does not offer any difficulties in principle and is rather cumbersome. We prove
the theorem for all relations R(x₁, ..., x_n) of the form x₁ = φ(x₂, ..., x_n) (where φ is a
primitive recursive function) and apply natural induction on φ's degree. For functions
of degree one (i.e. constants and the function x + 1) the theorem is trivial. Hence, let φ
be of degree m. It is built from functions of lower degree φ₁, ..., φ_k by the operations of
insertion and primitive recursive definition. Since everything has already been proven
for φ₁, ..., φ_k by the inductive assumption, there are corresponding RELATION SIGNS
r₁, ..., r_k such that (3), (4) hold. The definition processes by which φ is built from
φ₁, ..., φ_k (insertion and primitive recursion) can all be modeled formally in system P.
Doing this, one gets from r₁, ..., r_k a new RELATION SIGN r for which one can prove the
validity of (3), (4) without difficulties. A RELATION SIGN r associated with a primitive
recursive relation in this manner shall be called primitive recursive.

2.6 Undecidability theorem 

We now come to the goal of our elaborations. Let κ be any class of formulae. We
denote with Conseq(κ) the smallest set of formulae that contains all formulae of
κ and all axioms and is closed under the relation "immediate consequence". κ is
called ω-consistent if there is no CLASS-SIGN a such that

    (∀n . subst(a, v, number(n)) ∈ Conseq(κ)) ∧ not(forall(v, a)) ∈ Conseq(κ)

where v is the FREE VARIABLE of the CLASS-SIGN a. In other words, a witness
against ω-consistency would be a formula a with one free variable where we can derive a(n)
for all n, but also ¬∀n . a(n), a contradiction.

Every ω-consistent system is, of course, also consistent. The reverse, however, does
not hold true, as will be shown later. We call a system consistent if there is no formula
a such that both a and ¬a are provable. Such a formula would be a witness against the
consistency, but in general not against the ω-consistency. In other words, ω-consistency is
stronger than consistency: the first implies the latter, but not vice versa.

The general result about the existence of undecidable propositions goes as follows:

Theorem VI: For every ω-consistent primitive recursive class κ of formulae there
is a primitive recursive CLASS-SIGN r such that neither forall(v, r) nor not(forall(v, r))
belongs to Conseq(κ) (where v is the free variable of r).
Since the premise in the theorem is ω-consistency, which is stronger than consistency, the
theorem is less general than if its premise were just consistency.

Proof: Let κ be any ω-consistent primitive recursive class of FORMULAE. We define:

    isProofFigure_κ(x) ⇔
        (∀n ≤ length(x) . isAxiom(item(n, x)) ∨ (item(n, x) ∈ κ) ∨
            ∃0 < p, q < n . immConseq(item(n, x), item(p, x), item(q, x))) ∧
        length(x) > 0                                                     (5)

(compare to the analogous concept 44)

    proofFor_κ(x, y) ⇔ isProofFigure_κ(x) ∧ item(length(x), x) = y        (6)
    provable_κ(x) ⇔ ∃y . proofFor_κ(y, x)                                 (6.1)

(compare to the analogous concepts 45, 46).
The following obviously holds:

    ∀x . (provable_κ(x) ⇔ x ∈ Conseq(κ)),        (7)
    ∀x . (provable(x) ⇒ provable_κ(x)).          (8)



Now we define the relation:

    Q(x, y) ⇔ ¬(proofFor_κ(x, subst(y, 19, number(y)))).        (8.1)

Intuitively Q(x, y) means x does not prove y(y).

Since proofFor_κ(x, y) (by (6), (5)) and subst(y, 19, number(y)) (by definitions 17,
31) are primitive recursive, so is Q(x, y). According to theorem V we hence have a
RELATION SIGN q (with the FREE VARIABLES 17, 19) such that the following holds:

    ¬proofFor_κ(x, subst(y, 19, number(y))) ⇒
        provable_κ(subst(q, 17 19, number(x) number(y)))          (9)

    proofFor_κ(x, subst(y, 19, number(y))) ⇒
        provable_κ(not(subst(q, 17 19, number(x) number(y))))     (10)



We set:

    p = forall(17, q)        (11)

(p is a CLASS-SIGN with the FREE VARIABLE 19 (which intuitively means 19(19), i.e.
y(y), is unprovable)) and

    r = subst(q, 19, number(p))        (12)

(r is a primitive recursive CLASS-SIGN with the FREE VARIABLE 17 (which intuitively
means that 17, i.e. x, does not prove p(p), where p(p) means p(p) is unprovable)).
Then the following holds:

    subst(p, 19, number(p)) = subst(forall(17, q), 19, number(p))
                            = forall(17, subst(q, 19, number(p)))        (13)
                            = forall(17, r)

(because of (11) and (12)); furthermore:

    subst(q, 17 19, number(x) number(p)) = subst(r, 17, number(x))        (14)

(because of (12)). The recurring forall(17, r) can be interpreted as there is no proof
for p(p), in other words, forall(17, r) states that the statement p(p) that states its own
unprovability is unprovable. If we now insert p for y in (9) and (10), we get, taking (13)
and (14) into account:

    ¬proofFor_κ(x, forall(17, r)) ⇒ provable_κ(subst(r, 17, number(x)))         (15)
    proofFor_κ(x, forall(17, r)) ⇒ provable_κ(not(subst(r, 17, number(x))))     (16)

This yields:
This yields: 

1. forall(17, r) is not κ-provable. Because if that were the case, there would (by
(7)) exist an n such that proofFor_κ(n, forall(17, r)). By (16) we would hence have:

    provable_κ(not(subst(r, 17, number(n)))),

while on the other hand the κ-provability of forall(17, r) also implies that of
subst(r, 17, number(n)). Therefore κ would be inconsistent (and in particular
ω-inconsistent).

2. not(forall(17, r)) is not κ-provable. Proof: As has just been shown, forall(17, r)
is not κ-provable, i.e. (by (7)) we have

    ∀n . ¬proofFor_κ(n, forall(17, r)).

This implies by (15)

    ∀n . provable_κ(subst(r, 17, number(n)))

which would, together with

    provable_κ(not(forall(17, r))),

contradict the ω-consistency of κ.
Therefore forall(17, r) is not decidable from κ, whereby theorem VI is proved.

2.7 Discussion



One can easily convince oneself that the proof just given is constructive, i.e. the
following is proven in an intuitionistically unobjectionable way:

Let any primitive recursively defined class κ of formulae be given. Then if a formal
decision (from κ) of the proposition-formula forall(17, r) is also given, one can
effectively present:

1. A proof for not(forall(17, r)).

2. For any given n a proof for subst(r, 17, number(n)),

i.e. a formal decision for forall(17, r) would imply the effective presentability of an
ω-inconsistency-proof.

Let us call a relation (class) between natural numbers R(x1, ..., xn) decision-definite
if there is an n-ary RELATION SIGN r such that (3) and (4) (cf. theorem V) hold. In
particular, therefore, every primitive recursive relation is decision-definite by theorem V.
Analogously, a RELATION SIGN shall be called decision-definite if it corresponds to a
decision-definite relation in this manner. For the existence of propositions undecidable
from κ it is now sufficient to require of a class κ that it is ω-consistent and
decision-definite. In other words, it is not even important how the class of added axioms κ is
defined; we just have to be able to decide with the means of the system whether something
is an axiom or not. This is because the decision-definiteness carries over from κ to
proofFor_κ(x, y) (compare to (5), (6)) and to Q(x, y) (compare to (9)), and only that
was used for the above proof. In this case, the undecidable theorem takes on the form
forall(v, r), where r is a decision-definite CLASS-SIGN (by the way, it is even sufficient
that κ is decision-definite in the system augmented by κ).

If instead of ω-consistency we only assume consistency for κ, then, although the
existence of an undecidable proposition does not follow, there follows the existence of a
property (r) for which a counter-example is not presentable and neither is it provable
that the relation holds for all numbers. Because for the proof that forall(17, r) is
not κ-PROVABLE we only used the consistency of κ (compare to page 189), and
¬provable_κ(forall(17, r)) implies by (15) for each number x that subst(r, 17, number(x))
holds, i.e. that for no number not(subst(r, 17, number(x))) is provable.

If you add not(forall(17, r)) to κ you get a consistent but not ω-consistent class of
formulae κ'. κ' is consistent because otherwise forall(17, r) would be κ-provable. But
κ' is not ω-consistent, since because of ¬provable_κ(forall(17, r)) and (15) we have

∀x . provable_κ(subst(r, 17, number(x))),

and hence in particular

∀x . provable_κ'(subst(r, 17, number(x))),

and on the other hand of course

provable_κ'(not(forall(17, r))).

But that means that forall(17, r) precisely fits the definition of a witness against
ω-consistency.

A special case of theorem VI is the theorem where the class κ consists of a finite
number of formulae (and perhaps the ones derived from these by type-lift). Every
finite class κ is of course primitive recursive. Let a be the largest contained number.
Then we have for κ in this case

x ∈ κ ⇔ ∃m ≤ x, n ≤ a . n ∈ κ ∧ x = typeLift(m, n)

Hence, κ is primitive recursive. This allows us to conclude for example that also with
the help of the axiom of choice (for all types) or the generalized continuum hypothesis
not all propositions are decidable, assuming that these hypotheses are ω-consistent.
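
The reason the condition above yields a primitive recursive class is that both
quantifiers are bounded, so membership can be decided by a finite search. The following
Python sketch illustrates this under loud assumptions: type_lift is a toy stand-in (it is
not the type-lift on formula numbers defined in the paper), BASE is an arbitrary finite
set of "formula numbers", and in_class is a hypothetical helper name.

```python
def type_lift(m: int, n: int) -> int:
    """Toy stand-in for typeLift(m, n); NOT the paper's operation on formula numbers.

    It only shares the properties the bounded search relies on:
    type_lift(0, n) == n, and for n >= 1 the result is at least m and at least n.
    """
    return n * 3 ** m


def in_class(x: int, base: frozenset) -> bool:
    """Decide membership of x by searching m <= x and n <= a only."""
    if not base:
        return False
    a = max(base)  # the largest number contained in the finite base set
    return any(
        n in base and x == type_lift(m, n)
        for m in range(x + 1)      # bounded quantifier: m <= x
        for n in range(a + 1)      # bounded quantifier: n <= a
    )


BASE = frozenset({2, 5})  # arbitrary example set, an assumption of this sketch
members_below_20 = [x for x in range(20) if in_class(x, BASE)]
```

Because the search space for each x is finite and given in advance, no unbounded
search is needed. That is the feature the primitive recursiveness argument depends on.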

During the proof of theorem VI we did not use any other properties of the system 
P than the following: 

1. The class of axioms and deduction rules (i.e. the relation "immediate
consequence") are primitive recursively definable (as soon as you replace the basic
signs by numbers in some way).

2. Every primitive recursive relation is definable within the system P (in the sense 
of theorem V). 

Hence there are undecidable propositions of the form ∀x . F(x) in every formal
system that fulfills the preconditions 1, 2 and is ω-consistent, and also in every extension
of such a system by a primitive recursively definable, ω-consistent class of axioms.
Systems of this kind include, as one can easily confirm, the Zermelo-Fraenkelian
axiom-system and the von Neumannian system of set-theory, furthermore the axiom-system
of number-theory which consists of the Peano axioms, primitive recursive definition (by
schema (2)) and the logical deduction rules. Simply every system whose deduction rules
are the usual ones and whose axioms (analogously to P) are made by insertion into
a finite number of schemas fulfills precondition 1.

3 Generalizations 

— omitted — 

4 Implications for the nature of consistency 

— omitted — 

A Experiences



This translation was done for a reason. I took Mike Eisenberg's class "Computer 
Science: The Canon" at the University of Colorado in fall 2000. It was announced as 
a "great works" lecture-and-discussion course, offering an opportunity to be pointed 
to some great papers, giving an incentive to read them, and providing a forum for 
discussion. This also explains my motivation for translating Gödel's proof: it is a truly
impressive and fascinating paper, there is some incentive in completing my final paper
for a class, and this exercise should and did benefit me intellectually. Here, I will try
to share the experience I gained doing the translation. I deliberately chose a personal,
informal style for this final section to stress that what I write here are just my opinions, 
nothing less and nothing more. 

A.1 Have I learned or gained something?

First of all, how useful is it to read this paper anyway, whether you translate it or not? 
One first answer that comes to mind is that the effort of understanding it hones abstract 
thinking skills, and that some basic concepts like Peano axioms, primitive recursion, 
or consistency are nicely illustrated and shown in a motivated context. The difficulty 
with this argument is that it is self-referential: we read this paper to hone skills that 
we would not need to hone if we did not read this kind of paper. I don't really have
a problem with that, people do many things for their own sake, but fortunately there 
are other gains to be had from reading this paper. One thing that I found striking is 
how I only fully appreciated the thoughts from section 2.7 on re-reading the proof. I 
find it fascinating just how general the result is: your formal system does not need to 
be finite, or even primitive recursively describable, no, it suffices that you can decide 
its set of axioms in itself. I am not sure whether other people are as fascinated by this 
as I am; if you are not, try to see what I mean, it's worth the effort! But in any case, I 
have gained an appreciation for the beauty and power of the results, which I am happy 
for. Finally, there is some hope that the writing skills, proof techniques, and thought 
processes exhibited by this paper might rub off, so to speak. Part of learning an art is 
to study the masters, and Gödel was clearly a master in his art!

Second, how useful was the translation itself? Well, it was useful to try out some 
ideas I had about how translating a technical paper between languages might work. My 
recipe was to first read and understand the whole paper, then translate it one sentence 
at a time (avoid starting on a sentence before having a plan for all of it!), and
finally to read it fast to check the flow and logic. This might or might not be the 
best way, but it worked well enough for me. For understanding the paper itself, the 
translation between languages or the use of hyper-text as a medium did not help me 
much as I was doing it. More important was the translation of notation to one I am 
more used to, and the occasional comment to express my view of a tricky detail. Last 
but not least, it seems hardly necessary to admit my strong affection for type-setting, 
and there is a certain pleasure in looking over and polishing something you crafted that 
I believe I share with many people.

A.2 Has the paper improved?

The original paper is brilliant, well-written, rich in content, relevant. Yet I went ahead
and tinkered around, changing a little thing here and there, taking much more freedom
than the translators for [2, 1]. Yes, the paper did improve! It is closer to my very
personal ideas of what it should ideally look like. To me, having made the changes just a
few days ago, the modified paper looks better than the original.

In section 0, I announced a translation along three dimensions, namely language 
(from German to English), notation (using symbols I am more used to) and medium 
(exploiting hyper-text). Let us review each one in turn and criticize the changes. 

Language. After finishing the translation, I compared it with the ones in [2, 1], and 
found that although they are different, the wording probably does not matter all 
that much. To give an example where it did seem to play a role, here are three 
wordings for besteht eine nahe Verwandtschaft: (i) is closely related, (ii) is also 
a close relationship, (iii) is also a close kinship. The third one is mine, and my 
motivation for it is that it is the most punchy one, for what it's worth. The 
stumbling blocks in this kind of translation, as I see it, are rather the technical 
terms that may be in no dictionary. For example, I could well imagine that 
my translation decision-definite seems unnatural to someone studying logic who
might be used to another term. 

Notation. This is the part of the translation that I believe helps the most in making 
the paper more accessible for readers with an educational background similar to
mine. For example, I have never seen "Π" used for "∀", and inside an English
text I find "bound(v, n, x)" easier to parse than "v Geb n, x".

Hyper-text. During the translation, it became clearer to me just how very "hyper- 
text" the paper already was! On the one hand, the fact that one naturally refers 
back to definitions and theorems underlines that hyper-text is a natural way of 
presentation. On the other hand, the fact that one gets along quite well with a 
linear text, relying on the readers to construct the thought-building in their own 
head, seems to suggest that the change of medium was in fact rather superficial. 
I would be interested in the opinions of readers of this document on this: did the 
hyper-text improve the paper? 

Clearly, the most important aspects of the paper are still the organization, writing, 
and explanation skills of Gödel himself. And clearly, the paper is still an intellectual
challenge, yielding its rewards only to the fearless. To assume that my work has changed 
either of these facts significantly would be presumptuous. 

A.3 Opinions

This discussion is a trade-off between being careful and thoughtful on the one hand, 
and being forthcoming and fruitful on the other. As it leans more to the second half 
of the spectrum, I might as well go ahead and state some opinions the project and the 
reflections upon it have inspired in me. 

• A well-written technical paper already has the positive features of hyper-text. 
This may not seem so at first glance, but compare it to the typical web-page 
and then ask yourself which has more coherence. To me, coherence is part of the 
essence and beauty of cross-referencing. 

• There is an analogy between writing papers and computer programs, and it is
amazing how far you can stretch it without it breaking down. The skills of gradually
building up your vocabulary, dividing and conquering the task in a clean and
skillful way, and commenting on what you do are all illustrated nicely by Gödel's
proof.

• Reading and understanding Gödel's proof yields many benefits. There are pearls
to be found in its contents, and skills to be practiced that go beyond what one 
might think at first glance. 

I am well aware that I did not give many arguments to support these opinions. 
That would be the stuff for a paper by itself, and the reader is encouraged to think 
about them. But above all, enjoy the paper "On formally undecidable propositions of 
Principia Mathematica and related systems I" itself, which after all makes up the main 
part of this document! 